libpod: run conmon from the persist directory #23243

Closed

Conversation

giuseppe
Member

@giuseppe giuseppe commented Jul 10, 2024

conmon creates an "oom" file inside the current working directory when an OOM event happens in the cgroup. Run conmon from the persist directory so it doesn't leak files into the directory where Podman runs.
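As a rough illustration of the idea (not the actual libpod change), a Go parent process can choose the child's working directory through the `Dir` field of `exec.Cmd`. The helper name `commandInDir`, the temporary `persistDir`, and the use of `pwd` as a stand-in for conmon are all assumptions made for this sketch.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// commandInDir builds a command whose working directory is dir instead of
// the caller's current directory. Files the child creates through relative
// paths (such as conmon's "oom" file) would then land in dir.
func commandInDir(dir, name string, args ...string) *exec.Cmd {
	cmd := exec.Command(name, args...)
	cmd.Dir = dir // the child is started with this directory as its cwd
	return cmd
}

func main() {
	// persistDir is a hypothetical stand-in for the container's persist directory.
	persistDir, err := os.MkdirTemp("", "persist-")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}
	defer os.RemoveAll(persistDir)

	// "pwd" stands in for conmon here; it prints the directory it was started in.
	out, err := commandInDir(persistDir, "pwd").Output()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}
	fmt.Printf("child ran in %s", out)
}
```

The parent's own working directory is untouched; only the spawned process sees the new one, which is why anything the child later resolves relative to its cwd is affected.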

Does this PR introduce a user-facing change?

None

@giuseppe giuseppe added the "No New Tests" label (Allow PR to proceed without adding regression tests) Jul 10, 2024
@openshift-ci openshift-ci bot added the "do-not-merge/release-note-label-needed" label (Enforce release-note requirement, even if just None) Jul 10, 2024
Contributor

openshift-ci bot commented Jul 10, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: giuseppe

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the "release-note-none" and "approved" (Indicates a PR has been approved by an approver from all required OWNERS files.) labels and removed the "do-not-merge/release-note-label-needed" label (Enforce release-note requirement, even if just None) Jul 10, 2024
@Luap99
Member

Luap99 commented Jul 10, 2024

My only fear is that something uses relative paths and the cleanup process then uses the wrong path. I know this should not be the case, but I don't see how we could realistically prevent or test for such issues, so I wanted to avoid changing the cwd.

@giuseppe
Member Author

> My only fear is that something uses relative paths and the cleanup process then uses the wrong path. I know this should not be the case, but I don't see how we could realistically prevent or test for such issues, so I wanted to avoid changing the cwd.

That would make things less deterministic, though. What would happen if we run podman from a random directory that could then be removed? For example, running mkdir /tmp/foo; cd /tmp/foo; podman run -d ...; cd /; rmdir /tmp/foo should not affect the cleanup process, IMO.

@giuseppe
Member Author

I am not sure which is the better choice, though: the bundle directory or the persist directory. The OCI runtime uses the bundle directory too, while the persist directory is private to podman/conmon.

conmon creates an "oom" file inside the current working directory when
an OOM event happens in the cgroup.  Run conmon from the persist
directory so it doesn't leak files in the directory where Podman runs.

Signed-off-by: Giuseppe Scrivano <[email protected]>
@giuseppe giuseppe force-pushed the run-conmon-from-bundle-directory branch from 2f96119 to 0f7550c Compare July 10, 2024 08:53
@giuseppe giuseppe changed the title libpod: run conmon from the bundle directory libpod: run conmon from the persist directory Jul 10, 2024
@giuseppe
Member Author

> I am not sure which is the better choice, though: the bundle directory or the persist directory. The OCI runtime uses the bundle directory too, while the persist directory is private to podman/conmon.

Changed to run from the persist directory.

@Luap99
Member

Luap99 commented Jul 10, 2024

> > My only fear is that something uses relative paths and the cleanup process then uses the wrong path. I know this should not be the case, but I don't see how we could realistically prevent or test for such issues, so I wanted to avoid changing the cwd.
>
> That would make things less deterministic, though. What would happen if we run podman from a random directory that could then be removed? For example, running mkdir /tmp/foo; cd /tmp/foo; podman run -d ...; cd /; rmdir /tmp/foo should not affect the cleanup process, IMO.

Yes, of course it should not matter; all I am saying is that it could matter depending on the args and general config used. I don't see where we practically guarantee that we never use relative paths, so I fear this is going to break things.

Member

@Luap99 Luap99 left a comment

A simple example: if CONTAINERS_CONF is set to a relative path, the cleanup process will fail to find the file. Is it good design that the cleanup process just fails? No.

But this is what we have today, and this change causes an unknown number of breaking changes, so I don't think we can do this at all.

19908 faccessat(AT_FDCWD, "containers.conf", F_OK) = -1 ENOENT (No such file or directory)
19908 ioctl(2, TCGETS, 0xc00069f514)    = -1 ENOTTY (Inappropriate ioctl for device)
19908 write(2, "time=\"2024-07-10T11:10:32+02:00\" level=error msg=\"finding config on system: CONTAINERS_CONF file: faccessat containers.conf: no such file or directory\"\n", 152) = 152
19908 exit_group(1 <unfinished ...>
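The failure in the trace follows from how relative paths are resolved: the same string names a different file once the working directory changes. A minimal sketch of that effect is below; the helper `resolveFrom` and both directory names are hypothetical and only illustrate the resolution rule, not Podman's config lookup code.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// resolveFrom returns the path that a process whose working directory is cwd
// would actually open when given p. Absolute paths are unaffected by cwd.
func resolveFrom(cwd, p string) string {
	if filepath.IsAbs(p) {
		return filepath.Clean(p)
	}
	return filepath.Join(cwd, p)
}

func main() {
	const conf = "containers.conf" // relative value, as in the trace above

	// Both directories are hypothetical examples.
	fmt.Println(resolveFrom("/home/user/project", conf))          // where the user ran podman
	fmt.Println(resolveFrom("/var/lib/containers/persist", conf)) // where the cleanup process would run
}
```

Only the first location holds the file the user meant, so a cleanup process started from the second directory gets ENOENT.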

@giuseppe
Member Author

Sure, if we want to support relative paths, we cannot change the current working directory for conmon.

Should we rewrite the CONTAINERS_*_CONF env variables to be absolute paths or just give up on this change?
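The first option in the question above could look roughly like the following sketch, which rewrites relative values in an environment list before a child process is started. The function name `absolutizeConfEnv` and the prefix/suffix matching rule are assumptions for illustration; which variables would really need rewriting is not settled in this thread.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// absolutizeConfEnv returns a copy of env in which every non-empty relative
// value of a CONTAINERS_*_CONF-style variable is rewritten as an absolute
// path rooted at baseDir. All other entries are copied unchanged.
func absolutizeConfEnv(env []string, baseDir string) []string {
	out := make([]string, 0, len(env))
	for _, kv := range env {
		key, val, ok := strings.Cut(kv, "=")
		// Assumption: select variables by prefix and suffix; this rule also
		// matches the plain CONTAINERS_CONF name.
		isConf := ok && strings.HasPrefix(key, "CONTAINERS_") && strings.HasSuffix(key, "_CONF")
		if isConf && val != "" && !filepath.IsAbs(val) {
			kv = key + "=" + filepath.Join(baseDir, val)
		}
		out = append(out, kv)
	}
	return out
}

func main() {
	env := []string{
		"CONTAINERS_CONF=containers.conf",
		"CONTAINERS_STORAGE_CONF=/etc/containers/storage.conf",
		"HOME=/root",
	}
	// The base directory is a hypothetical example; in practice it would be
	// the working directory of the original podman invocation.
	for _, kv := range absolutizeConfEnv(env, "/home/user/project") {
		fmt.Println(kv)
	}
}
```

A rewrite like this only covers paths passed through these variables; relative paths arriving through other arguments or config values would be untouched.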

@Luap99
Member

Luap99 commented Jul 10, 2024

> Should we rewrite the CONTAINERS_*_CONF env variables to be absolute paths or just give up on this change?

Well, that seems like a good idea in general, but it will not fix my overall concern; it was just one example to show how this change can break things. Who knows what other relative paths we may use? To me this is a big unknown that we cannot test for. I really do not want to risk breaking people over this change.

@giuseppe
Member Author

New attempt at solving it in conmon: containers/conmon#514

@giuseppe giuseppe closed this Jul 10, 2024
@stale-locking-app stale-locking-app bot added the locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. label Oct 9, 2024
@stale-locking-app stale-locking-app bot locked as resolved and limited conversation to collaborators Oct 9, 2024